Add support for parsing digraphs #158
Closed
HaydenMichel8 wants to merge 5 commits into
… a replacement for [
virtuald (Member) reviewed on Jun 26, 2026 and left a comment:
Thanks for the contribution! Looking at https://en.cppreference.com/cpp/language/operator_alternative, they have this example which your PR fails to parse:

    %:include <iostream>

    struct X
    <%
        compl X() <%%> // destructor
        X() <%%>
        X(const X bitand) = delete; // copy constructor
        // X(X and) = delete; // move constructor

        bool operator not_eq(const X bitand other)
        <%
            return this not_eq bitand other;
        %>
    %>;

    int main(int argc, char* argv<::>)
    <%
        // lambda with reference-capture:
        auto greet = <:bitand:>(const char* name)
        <%
            std::cout << "Hello " << name
                      << " from " << argv<:0:> << '\n';
        %>;

        if (argc > 1 and argv<:1:> not_eq nullptr)
            greet(argv<:1:>);
        else
            greet("Anon");
    %>

I think it's fine to not support the weird include thing or named versions of operators like `bitand`, but if you replace those it still gets stuck on the `char* argv<::>`.
I asked GPT to solve the digraph problem. I think its approach is slightly better, and it parses these cases (see #160).

    """<% %> should work as { } in a function body context (body is skipped)."""
    # The parser skips function bodies but must recognise the braces.
    content = """\
    #include <iostream>

For any future contributions, please use `python -m cxxheaderparser.gentest` to generate tests.
C++ allows digraphs as a replacement for certain tokens. While trigraphs were removed in C++17, digraphs are still valid in modern C++. They exist because some keyboards and character sets lack the characters they stand in for. Namely, the following replacements can be used:
- `<:` replaces `[`
- `:>` replaces `]`
- `<%` replaces `{`
- `%>` replaces `}`

So this means that something like
is valid C++ code (verified as compiling on https://godbolt.org/ but giving a parse error on https://robotpy.github.io/cxxheaderparser/).
This change adds parsing support for these tokens, which lets cxxheaderparser parse legacy C or C++ code that uses them.
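
To make the four replacements above concrete, here is a minimal sketch of digraph normalisation in Python. It is not how cxxheaderparser or either PR implements this: `DIGRAPHS` and `replace_digraphs` are hypothetical names, and the function rewrites raw text rather than lexer tokens, so it would also touch digraph-like sequences inside string literals and comments. It does cover one corner case from the C++ lexing rules: when the next characters are `<::` and the character after them is neither `:` nor `>`, the `<` is a token on its own, so `vector<::std::string>` is left alone while `argv<::>` becomes `argv[]`.

```python
import re

# Hypothetical helper for illustration only; not part of cxxheaderparser.
DIGRAPHS = {"<:": "[", ":>": "]", "<%": "{", "%>": "}"}

# The first alternative implements the `<::` exception: a `<` followed by `::`
# and then anything other than `:` or `>` is a plain `<`, not the start of `<:`.
# It must come before the `<:` alternative so that it wins when both could match.
_PATTERN = re.compile(r"<(?=::(?![:>]))|<:|:>|<%|%>")


def replace_digraphs(source: str) -> str:
    """Rewrite the four bracket/brace digraphs to their primary spellings.

    Assumption: the input is plain code text; string literals and comments
    are not treated specially in this sketch.
    """

    def _sub(match: re.Match) -> str:
        text = match.group(0)
        # A lone `<` (the exception case) is not in the table and is kept as-is.
        return DIGRAPHS.get(text, text)

    return _PATTERN.sub(_sub, source)


if __name__ == "__main__":
    print(replace_digraphs("int main(int argc, char* argv<::>) <% %>"))
    print(replace_digraphs("std::vector<::std::string> names;"))
```

A real parser would more likely handle this in the lexer, emitting the same token type for `<:` and `[`, so that the rest of the parser never sees the alternative spelling.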